SUMMARY:
· 14 years of total IT experience in software analysis, design, and development for various software applications in client-server environments, with expertise in providing Business Intelligence solutions in Data Warehousing for Decision Support Systems and the full SDLC.
· Over 11 years of practical hands-on experience in DW and BI reporting implementation.
· Architect, build, and sustain Hadoop ecosystem environments: MapReduce jobs, Hive, HBase, Pig, and NoSQL stores including MongoDB.
· Worked on multi-clustered environments and set up Big Data and Hadoop ecosystems.
· Set up and configured Hadoop on Amazon AWS for processing massive volumes of data (a smoke-test sketch follows below).
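For illustration only, a minimal shell sketch of the kind of post-setup checks run on such a cluster; the HDFS paths and file names are hypothetical, not from the original projects.

    #!/bin/bash
    # Smoke test for a freshly configured Hadoop cluster (hypothetical paths).
    set -e

    # Confirm the NameNode is reachable and report cluster capacity.
    hadoop dfsadmin -report

    # Stage a local sample file into HDFS (-p needs CDH4/Hadoop 2 or later).
    hadoop fs -mkdir -p /user/etl/raw
    hadoop fs -put /tmp/sample_data.csv /user/etl/raw/

    # Verify the file landed and check block health.
    hadoop fs -ls /user/etl/raw
    hadoop fsck /user/etl/raw/sample_data.csv -files -blocks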
· Involved in designing extraction processes, data analysis, requirement study, data modeling, and performance tuning.
· Expert in performance-tuning methods, partitioning, and pushdown optimization for large-volume data loads.
· Provide leadership and best practices for installation, administration, backup & recovery, unit testing, release management, and design of features and functionality on BI projects.
· Designed and developed complex mapping logic from varied transformations such as Unconnected and Connected Lookups, Router, Filter, Expression, Aggregator, Joiner, Update Strategy, Union, and more.
· Experience integrating various data sources with multiple relational databases such as Oracle, SFDC, Teradata, SQL Server, MS Access, DB2, and XML systems. Used Teradata utilities such as FastLoad, MultiLoad, and FastExport (a FastLoad sketch follows below).
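As an illustration, a minimal FastLoad wrapper of the kind used for such bulk loads; the host, credentials, table, and file names are all hypothetical.

    #!/bin/bash
    # Bulk-load a delimited file into an empty stage table (hypothetical names).
    fastload <<'EOF'
    LOGON tdprod/etl_user,etl_password;

    /* FastLoad requires two error tables and an empty target table. */
    BEGIN LOADING staging.customer_stg
      ERRORFILES staging.customer_err1, staging.customer_err2;

    SET RECORD VARTEXT "|";
    DEFINE cust_id   (VARCHAR(18)),
           cust_name (VARCHAR(60)),
           state     (VARCHAR(2))
    FILE = /data/in/customer.dat;

    INSERT INTO staging.customer_stg (cust_id, cust_name, state)
    VALUES (:cust_id, :cust_name, :state);

    END LOADING;
    LOGOFF;
    EOF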
· Maintain Informatica code, mappings, workflows, and parameters; create and manage users; set up LDAP environments; troubleshoot and resolve issues (a workflow-run sketch follows below).
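Workflow runs of this kind are commonly driven from shell via pmcmd; a hedged sketch in which the domain, service, folder, and workflow names are hypothetical:

    #!/bin/bash
    # Start an Informatica workflow and gate downstream steps on its exit code
    # (domain/service/folder/workflow names are hypothetical).
    pmcmd startworkflow \
      -sv IS_PROD -d Domain_ETL \
      -u "$INFA_USER" -p "$INFA_PASS" \
      -f FIN_DW -wait wf_load_daily_sales

    if [ $? -ne 0 ]; then
      echo "Workflow wf_load_daily_sales failed" >&2
      exit 1
    fi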
· Create appropriate indexes, use optimizer hints, rebuild indexes, and use Explain Plan and SQL tuning (a sketch follows below).
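A minimal sketch of that Explain Plan workflow from SQL*Plus; the connect string, hint, and table names are hypothetical.

    #!/bin/bash
    # Capture and display the optimizer plan for a query (hypothetical schema).
    sqlplus -s etl_user/etl_password@ORCL <<'EOF'
    EXPLAIN PLAN FOR
      SELECT /*+ INDEX(o ord_cust_ix) */ o.order_id, o.amount
      FROM   orders o
      WHERE  o.customer_id = 42;

    -- Render the plan just captured in PLAN_TABLE.
    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
    EOF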
· Analyze business requirements and translate them into functional and technical design specifications and implementation plans.
· Self-starter and results-oriented team player with the ability to manage multiple tasks simultaneously.
· Proven strong analytical and logical skills, with the ability to follow project standards and procedures per client specifications.
Achievements:
· Implemented a cloud-based ecosystem (HDFS, Pig, Hive, and MapReduce) on AWS.
· Excellent performance in data extraction, transformation, and load processes using various tools including Informatica PowerCenter, achieving a high level of customer satisfaction.
· Implemented excellent DW/DM projects for the Financial and Utilities sectors.
· Parsed unstructured and semi-structured documents such as multi-tab Excel sheets and XML documents by implementing complex data transforms using Informatica PowerCenter.
· Provided thorough documentation of the entire business process to better streamline it.
Core Skills
Hadoop Distributions: Apache Hadoop, Cloudera CDH3/CDH4, MapR, and HortonWorks
Sub-Projects: Hive, HBase, Pig, Flume, ZooKeeper, Avro, Sqoop, Hue
Linux Flavors: Ubuntu 12.04.02 LTS, CentOS 6.3, RedHat Linux 6
Monitoring Tools: Ganglia, Cloudera Manager, Apache Hadoop Web UI
Automation: Puppet
Reporting and Other: OBIEE 10.1.x, QlikView, Tableau, SAP BusinessObjects
Databases: Oracle 10g, Teradata, DB2, Vertica, Cassandra, MongoDB, MySQL
ETL Tools: Informatica PowerCenter 9.0.1, Data Quality
Database Tools: Oracle SQL*Plus, SQL, PL/SQL, Navigator, Toad 7.4, Teradata SQL
Languages: SQL, UNIX Shell Script (Bash, ksh), Python
Technical Certification and Education
B.S. (Mathematics): Pandit Ravi Shankar Shukla University, India
PGDCA (A Level): National Institute of Electronics and Information Technology (formerly known as DOEACC Society), New Delhi, India
DCS (Computers): APTECH Computer Education, India
Certifications:
Oracle Certification in Oracle 9i SQL
Informatica Certified Designer in Informatica PowerCenter 6.2
Informatica PowerCenter 8.1 Trained Professional from Informatica Corporation
Project Experience
Client: Gilead Sciences
Oct '11 – Present
Sr. Software Consultant – ETL and Big Data
Description: Implementation Partner Systems wanted analytic information on products, services, and contracts. Typical reports generated from the database include contract terms, service contracts, and forecast analysis. Worked with massive amounts of data to create a next-generation reporting and analytics system, and implemented a sustainable Big Data/Hadoop environment.
I am responsible for:
· Architect, build, and maintain a sustainable Hadoop ecosystem and sub-projects such as HDFS, MapReduce, HBase, and Pig for data storage and reporting per the business requirements.
· As system admin, worked on various activities along with analysis, design, and enhancement of the existing system.
· Evaluate different ways to visualize data and store it in the best possible way to sustain the HDFS environment.
· Built a platform for an in-memory reporting system to report on HDFS and the respective sub-projects.
· Worked on cluster monitoring, backup/recovery and disaster planning, and other administration activities.
· Create, update, and maintain project documents including business requirements, functional and non-functional requirements, functional design, Big Data documents, etc.
· Eliminated bottlenecks in MapReduce jobs that processed enterprise-wide, high-volume data.
· Develop best-practice methodology and documentation to support different aspects of the project.
· Install, configure, and maintain Hadoop in a multi-clustered environment; worked with MapReduce, HBase, Hive, Pig and Pig Latin, Flume, ZooKeeper, etc.
· Install and maintain monitoring tools such as Ganglia; guide, mentor, and develop others working in MapReduce and HBase roles.
· Create different applications and metrics to analyze data for business users, such as the Service Attach metric, multi-year analyses, etc.
· Worked as the point of contact for all Big Data implementations.
Environment: HDFS, Cloudera Hadoop Admin/CDH4, Shell Script, MapReduce, HBase, Pig, Flume, Oozie, Puppet, QlikView, and Tableau.
Client: Risk Management Solutions, Inc., Newark, CA
June '11 – Oct '11
Lead Consultant – ETL and Hadoop Implementation
Description: RMS is working to implement a new system and interface for their product RiskLink. By being data-efficient, fast as real life, open to the best, and in the cloud, customers will realize significant improvements in data management, automation, and operating efficiency, so more time and resources can be devoted to creating risk intelligence and disseminating insight. This data is used by the insurance and reinsurance industries to assess damage and act accordingly.
I was responsible for:
· Started cloud-based services with AWS and HDFS; worked on installation of Hadoop and its components (HDFS, MapReduce, Pig, Hive, and NoSQL DBs such as MongoDB).
· Perform data analysis and understand business requirements.
· Wrote MapReduce jobs and processed billions of records for analysis (a streaming-job sketch follows below).
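As a hedged illustration, a streaming-style job submission of the kind used for such record counting; the jar path, input/output paths, and the choice of shell commands as mapper and reducer are hypothetical (the production jobs may well have been written in Java).

    #!/bin/bash
    # Count records per key with Hadoop Streaming (hypothetical paths).
    hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming.jar \
      -D mapred.reduce.tasks=32 \
      -input  /user/etl/raw/events \
      -output /user/etl/out/event_counts \
      -mapper  'cut -f1' \
      -reducer 'uniq -c'

    # The shuffle sorts mapper output by key, so `uniq -c` in the reducer
    # emits one count per distinct key.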
· Worked on large-scale Hadoop environment builds and support, including design, performance tuning, and monitoring.
· Implemented file validation and a file management system for Informatica processing.
· Manage and monitor the Hadoop distribution using Ganglia and Puppet.
· Create, update, and maintain project documents including business requirements, functional and non-functional requirements, functional design, data mapping, etc.
· Identify opportunities for process optimization, process redesign, and development of new processes.
Environment: Hadoop, MapReduce, Pig, Hive, HP Vertica Database, SQuirreL SQL.
Client: PG&E, San Ramon, CA
Oct '10 – June '11
Sr. Informatica Consultant
Description: With the onset of the new Market Redesign Technology Upgrade, the Front Office determined a need to gather and analyze market data from various sources within CAISO and other PG&E internal applications in support of least-cost dispatch. To accomplish this, we worked toward an aggregation of data from various resources to provide daily, weekly, and quarterly analysis of the new markets and PG&E's Portfolio Optimization & Trading strategies. The project is known as ETO (Energy Trading and Optimization).
I was responsible for:
· Design and development of ETL routines using Informatica PowerCenter, including data flow management into multiple targets.
· Worked on information gathering from different sources within PG&E to implement the ETO project.
· Participate in BI and database product evaluations, POCs, and business decisions.
· Analyzed different sources of the existing system (XML, database, Excel) for extraction and aggregation into the new system; gained experience with Agile methodology.
· Worked as Informatica administrator and performed admin activities, e.g., installs, patches, and environment creation.
· Worked on the initial implementation of Informatica PowerCenter 9, shell scripts, and other tools.
· Implemented file management systems for XML and unstructured Excel files.
· Extracted data from an XML system for master and detail record sets and established a one-to-many relationship in the database using Informatica while populating it.
· Contributed to the technical documents for dimension and fact tables.
· Wrote high-level test plans and scripts; tested the Informatica build and scripts.
· Worked on performance improvement and tuned mappings, sessions, and workflows for better performance.
Environment: Oracle 10g, PL/SQL, Informatica PowerCenter 9.0.1, BusinessObjects reports, XML sources, Toad, Linux.
Client: HCL-Axon, Miami
Jan '10 – Oct '10
Hadoop Consultant
Description: HCL-Axon was implementing a project to migrate all Caribbean-island telecommunications (LIME) legacy data to an SAP system and to implement a data warehouse for reporting purposes. The data, related to sales and marketing, needed to be brought back into the data warehouse through an ETL process.
I was responsible for:
· Implemented an ECTL process to move data from the existing legacy system.
· Identify opportunities for process optimization, process redesign, and development of new processes.
· Led the BI team, designed the data model for Sales, and implemented shell scripts.
· Evaluated tools and worked on a POC for a Big Data implementation, including Hadoop environment setup.
· Worked on MapReduce jobs, Hive, HBase, Pig, and NoSQL (MongoDB).
· Designed the star schema and the physical and logical models, as well as the parameter and control tables.
· Responsible for the design, development, testing, and documentation of the Informatica mappings, transformations, and workflows based on LIME standards.
· Populated the data warehouse in Oracle through Informatica PowerCenter 8.6; participated in end-to-end development and was involved in decision making, meetings, and conference calls.
· Designed mappings for the staging area; responsible for developing and analyzing requirements and working out solutions.
Environment: Oracle 10g, PL/SQL, Informatica PowerCenter 8.6.0, Hadoop, HBase, Hive, Pig, NoSQL DBs.
Client: Commercial BI Project, Genentech, Inc., SSF
Aug '09 – Dec '09
Lead ETL Consultant
Description: Genentech engages in the research, development, manufacture, and marketing of biotechnology products for serious or life-threatening diseases. BI is responsible for managing the BI system database for Genentech customers, clients, and products. When fully implemented, the BI system (part of the CIT Commercial group) will enable the business to generate reports based on the ETL code and help high-level management with decision making.
I was responsible for:
· Designed, developed, and implemented an ECTL process to move existing Oracle tables to the DW. Performed data-oriented tasks on master data, especially patient and customer data: standardizing, cleansing, merging, and de-duping rules, along with UAT at each stage.
· Responsible for the design, development, testing, and documentation of the Informatica mappings, transformations, and workflows based on Genentech standards.
· Identify opportunities for process optimization, process redesign, and development of new processes.
· Initiated, defined, managed the implementation of, and enforced data QA processes; interacted with other teams, including the data quality team. Populated the data warehouse in Oracle through Informatica PowerCenter 8.1; performed performance tuning in Informatica and Oracle SQL.
· Used Informatica to populate Oracle tables from a flat-file system into the BI system.
· Designed mappings for the Landing, CADS, and CDM areas; responsible for developing and analyzing requirements and working out solutions. Tuned the SQL queries, mappings, and PL/SQL blocks.
· Performed incremental loads into dimension tables and worked in a highly normalized schema. Used stage, work, and DW table concepts to load data and applied star schema concepts. Used Informatica to schedule workflows through Workflow Manager.
Environment: Oracle 9i, PL/SQL, Informatica PowerCenter 8.1.6, UNIX Shell Script, BO.
Client: Oracle Consulting Services, Costa Mesa, CA
Jan '09 – Jul '09
Team Lead
Description: Oracle Consulting Services worked to implement a new system and interface for the Auto Club. Automobile Club Enterprises wanted to migrate their insurance payment system from the mainframe to Oracle R12. The migration required extracting payment detail information from GDG COBOL files, using Informatica to populate Oracle R12 tables, and distributing the files to internal systems and the corresponding bank. The ETL process also records errors and sends successful-transmission confirmations back to the system. A complete implementation enables the system to operate on Oracle R12 at go-live, retires the mainframe payment system, and feeds the data warehouse in Teradata.
I was responsible for:
· Responsible for preparing the BRD and technical specifications from the business requirements; participated in developing the ETL environment, processes, programs, and scripts to acquire data from source systems, populate the target system, and feed the data warehouse.
· Analyzed the business and documented it for development. Created tables and used Informatica to populate Oracle tables from the COBOL GDG file system into the EBS system.
· Mentored a team of three Informatica developers and prepared them for development.
· Worked with different OBIEE tools such as the Admin Tool, Answers, Dashboards, and Pivot Tables.
· Created the physical and logical layers and worked extensively in the different OBIEE layers along with reports. Designed the star schema model. Extracted and loaded DB2 tables for a customer.
· Wrote Teradata BTEQ scripts, FastLoad, MultiLoad, etc., to support the project.
· Performed administrative tasks as Informatica administrator and participated in data-oriented tasks on master data projects, especially members/payments: standardizing, cleansing, merging, and de-duping rules, along with UAT at each stage.
· Responsible for the design, development, testing, and documentation of the Informatica mappings, reports, and workflows based on AAA standards. Implemented DAC on this project for scheduling.
Environment: Oracle R12, PL/SQL, Informatica PowerCenter 8.6, Oracle EBS, OBIEE, Answers, Dashboards, DAC, DB2 source and target, COBOL copybooks.
Client: Wells Fargo Bank, San Francisco, CA
Apr '08 – Dec '08
Sr. ETL Consultant
Description: Obtaining customer and account information directly from mainframe operational systems affects mainframe utilization, operational system performance, the size of the mainframe environment, and mainframe availability. The objective of this project is to improve the availability of key business data and processes by replicating static or near-static data from those SORs into a non-mainframe database. A successful implementation will reduce mainframe load and, in the long run, save money by avoiding constant investment in additional mainframe capacity, while capturing the best available data quality, controlling it, and populating the data warehouse going forward.
I was responsible for:
· Provide technical guidance to the programming team on overall methodology, practices, and procedures for support and development, to ensure understanding of standards and of the profiling and cleansing process and methodology.
· Interact with the team to facilitate development, provide data quality reports, perform software migration activities, and accumulate data for the B2B solution.
· Worked in Informatica 8.x to create and deploy the business rules that populate data into tables.
· Created snapshots, summary tables, and views in the database to reduce system overhead and provide the best quality data for reporting; worked on cache management and DAC configuration.
· Created presentation-layer tables by dragging the appropriate BMM-layer logical columns in OBIEE.
· Developed global prompts, filters, customized reports, and dashboards.
· Implemented pivot tables and manipulated init blocks for report analysis.
· Perform data quality analysis, standardization, and validation, and develop data quality metrics.
· Ensured data quality at the source and target levels to generate proper data reports and profiling.
· Provide overall direction and guidance for ETL development and support of the Prescription Solutions Data Mart, applying Velocity best practices to the project work.
· Created and used reusable mapplets and transformations in Informatica PowerCenter.
· Responsible for the design, development, testing, and documentation of the Informatica mappings.
Environment: Windows, UNIX, Informatica 8.x, Oracle 10g, SQL, UNIX Shell Script, OBIEE, Answers, Admin Tool.
Client: PayPal, San Jose, CA
July '07 – Mar '08
Senior Data Warehouse ETL Developer
Description:
· ECTL of Oracle tables to Teradata using Informatica and BTEQ scripts. Migrated SAS code to Teradata BTEQ scripts.
· Predict current high-value sellers who are likely to churn based on their similarities to past high-value sellers; identify the dollar opportunity from reducing churn of high-value sellers.
I was responsible for:
· Responsible for preparing the technical specifications from the business requirements.
· Analyze the requirements and work out solutions; develop and maintain detailed project documentation.
· Used Informatica to generate flat files to load data from Oracle to Teradata, and BTEQ/FastLoad scripts for incremental loads (a BTEQ sketch follows below). Used stage, work, and DW table concepts to load data and applied star schema concepts. Created UDFs in the Java transformation to complete some tasks.
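An illustrative BTEQ sketch of such an incremental load step; the logon, database, table, and column names are hypothetical.

    #!/bin/bash
    # Insert only rows newer than the DW high-water mark (hypothetical names).
    bteq <<'EOF'
    .LOGON tdprod/etl_user,etl_password;

    INSERT INTO dw.seller_txn
    SELECT s.*
    FROM   stg.seller_txn s
    WHERE  s.txn_ts > (SELECT COALESCE(MAX(txn_ts),
                                       TIMESTAMP '1900-01-01 00:00:00')
                       FROM   dw.seller_txn);

    .IF ERRORCODE <> 0 THEN .QUIT 8;
    .LOGOFF;
    .QUIT 0;
    EOF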
· Designed, developed, and implemented an ECTL process for the Marketing team for existing Oracle tables; wrote BTEQ scripts to support the project.
· Used a version control system, ClearCase, to manage code in different code streams.
· Performed data-oriented tasks on master data projects, especially Customer/Party: standardizing, cleansing, merging, de-duping, and determining rules.
· Responsible for the design, development, testing, and documentation of the Informatica mappings, PL/SQL, transformations, and jobs based on PayPal standards.
· Initiate, define, manage the implementation of, and enforce DW data QA processes; interacted with the QA and data quality teams.
· Identify opportunities for process optimization, process redesign, and development of new processes.
· Anticipate and resolve data integration issues across applications and analyze data sources to highlight data quality issues. Performed performance analysis for the Teradata scripts.
· Migrated SAS code to Teradata BTEQ scripts to perform scoring, taking into account various parameters such as login details, transaction dollar amounts, etc. Worked with marketing data for various reports.
Environment: Oracle 9i, Informatica PC 8.1, PL/SQL, Teradata V2R6, Teradata SQL Assistant, FastLoad, BTEQ scripts, SAS code, ClearCase, Java, Perl scripts, XML sources.
Client: Charles Schwab, SLDM, Phoenix, AZ
Oct '06 – June '07
Informatica and ETL Lead
Description:
· The SchwabLink file delivery process is one of the cornerstone technology offerings of Schwab Institutional (SI). Effective portfolio management cannot be performed without data. Financial advisors doing business with SI rely on the SchwabLink file delivery process to obtain daily updates to customer, account, balance, position, and other important data required for portfolio management.
· The file delivery process harvests Schwab broker/dealer data for use in creating download files. In addition, data is received from external sources including U.S. Trust, The Charles Schwab Trust Company, The Charles Schwab Bank, and Great Western. In total, nearly 4 GB of data is received daily for processing and delivery of more than 68,000 individual files – in 79 different file formats – to advisors using the SchwabLink file download mechanism.
I was responsible for:
· Designed the roadmap for the ETL implementation and worked with the ETL team.
· Developed reusable mappings and mapplets to accomplish similar business rules using various transformations; configured VSAM files as sources.
· Designed the process for parameter file generation from Informatica.
· Used shell scripts to run the workflows and normalized the process.
· Handled check-in/check-out of mappings and verified locks.
· Created and used reusable mapplets and transformations in Informatica PowerCenter.
· Responsible for the design, development, testing, and documentation of the Informatica mappings, sessions, and workflows based on Schwab standards and specifications.
· Created scripts to populate Teradata tables, using BTEQ scripts along with Teradata utilities.
· Executed several sessions and created batches to automate the loading process.
· Loaded the data from VSAM files to Oracle and also created flat files.
· Generated XML sources/targets for specific metadata requirements.
Environment: Oracle 10g, Informatica PC 8.1.1, ClearCase 7.0, Toad, COBOL files, Teradata sources and targets.
DBS Bank, Singapore
Aug '05 – Sep '06
Senior Data Warehouse Consultant
Description: The Data Warehouse programme will enable the identification and realization of improvements in the areas of cross-selling and market-share growth, risk management, operational excellence, and finance and regulatory reporting by defining and building the Enterprise Data Warehouse, providing accurate, consistent, timely, and shared customer-centric information for Singapore and Hong Kong.
This project is about the business's requirement for consolidated, timely, standardized, and cleansed data, globally available on a timely basis, to ultimately answer business questions for identifying and realizing improvements across business lines and geographies. It is an "enterprise-wide" initiative that addresses the lack of quality, timely customer information, the need to merge more and more data from currently separate sources, and the need to establish standards for the wide range of information used within the bank.
I was responsible for:
· Worked as team lead; involved in the development team, setting standards for the Informatica developers on the project.
· Communicate with the immediate supervisor on the status of work in progress, in both verbal and written form.
· Support the whole system, TWS daily runs, and daily backups for all systems.
· Responsible for designing the ETL source map and transformation logic.
· Developed Teradata FastLoad, MultiLoad, and FastExport scripts.
· Created DSNs and database connections to support the Informatica architecture.
· Involved in working with various active and passive transformations.
· Created workflows and sessions to perform data loading.
· Designed test cases and validation scripts to test the loaded data.
· Tuned the mappings, data flow, and session property sheets to optimize performance.
· Worked on star schema and snowflake schema design.
· Wrote shell scripts to run Informatica workflows and backups.
Environment: Windows, IBM AIX, Teradata V2R5, Informatica PC 7.1.2 (Designer, Repository Manager, Workflow Manager, Monitor), Tivoli Workload Scheduler, Teradata utilities (FastLoad, MultiLoad, FastExport), SQL.
Hewlett-Packard, Malaysia, Singapore
May '04 – Jul '05
Team Lead / SME
Description: Create a tiered data warehouse architecture based on the requirements. Data from various data sources is consolidated into different business subject areas within the data warehouse layer. This ensures semantic integrity for common attributes across different data marts and eliminates redundancy in data extraction, transformation, and consolidation.
I was responsible for:
· Designed the tables and created the mappings to populate data.
· Used control tables and parameter files to run the mappings.
· Wrote shell scripts to take repository backups and run workflows.
· Implemented complex mappings using Expression, Aggregator, Filter, Joiner, Rank, Union, and Sequence Generator transformations and procedures; populated Teradata tables for the DW implementations.
· Developed and tested ETL procedures to ensure conformity, compliance with standards, and lack of redundancy; translated business rules and functional requirements into ETL procedures.
· Created parameter files and handled version control through ClearCase.
· Improved performance by identifying bottlenecks at the source, target, mapping, and session levels.
· Used BTEQ scripts. Implemented change requests and was involved in testing the mappings and documents.
Environment: Informatica PC 6.2, Oracle 8i/9i, Cognos Report Studio, Toad, SQL, Informatica Repository Manager, Designer, UNIX Shell Script, Windows 2000/NT, Teradata V2R1, Rational ClearCase.
GE Medical Systems, Bangalore, India
May '03 – Apr '04
ETL Consultant
Description: The customer wanted to integrate their data from all origins, both legacy and Oracle. After the integration, we apply business logic and use Informatica sessions to pull the data into the target system; this data is then used to generate Cognos cubes and reports.
I was responsible for:
· Performed various loads (daily, weekly, and monthly) based on the request.
· Automated the extraction process for some origins.
· Developed various mappings to implement business logic.
· Used shell scripts to run the sessions.
· Used Update Strategy and target load plans to load data into Type-2 dimensions.
· Created and used reusable mapplets and transformations in Informatica PowerCenter.
· Improved performance by identifying bottlenecks at the source, target, mapping, and session levels.
· Created snapshots, summary tables, and materialized views in the database to reduce system overhead (a sketch follows below).
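For illustration, a sketch of the kind of summary materialized view used to offload repeated aggregation; the connect string, schema, and refresh options are hypothetical.

    #!/bin/bash
    # Create a monthly sales summary as a materialized view (hypothetical schema).
    sqlplus -s dw_user/dw_password@GEMS <<'EOF'
    CREATE MATERIALIZED VIEW mv_monthly_sales
      BUILD IMMEDIATE
      REFRESH COMPLETE ON DEMAND
    AS
    SELECT product_id,
           TRUNC(sale_date, 'MM') AS sale_month,
           SUM(amount)            AS total_amount,
           COUNT(*)               AS txn_count
    FROM   sales_fact
    GROUP  BY product_id, TRUNC(sale_date, 'MM');
    EOF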
· Designed and developed ETL routines using Informatica PowerCenter. Within the Informatica mappings, made extensive use of Lookups, Aggregator, Rank, stored procedures/functions, SQL overrides in Lookups, source filters in Source Qualifiers, and data flow management into multiple targets using Routers.
· Executed several sessions and created batches to automate the loading process.
· Involved in testing the developed mappings and created the test cases.
· Loaded the data from .CSV files to Oracle and Teradata.
Environment: Informatica PowerCenter 6.2, Oracle 8i, Shell Scripts, Teradata, Teradata utilities (FastLoad, MultiLoad, FastExport), SQL.
Worked with different companies as Developer and Consultant – Jan '99 through Apr '03.